
    Opening the Black Box: Analysing MLP Functionality Using Walsh Functions

    The Multilayer Perceptron (MLP) is a neural network architecture that is widely used for regression, classification and time series forecasting. One often-cited disadvantage of the MLP, however, is the difficulty associated with human understanding of a particular MLP's function. This so-called black box limitation is due to the fact that the weights of the network reveal little about the structure of the function they implement. This paper proposes a method for understanding the structure of the function learned by MLPs that model functions of the class f: {-1,1}^n -> R, which includes regression and classification models. A Walsh decomposition of the function implemented by a trained MLP is performed and the coefficients are analysed. The advantage of a Walsh decomposition is that it explicitly separates the contribution to the function made by each subset of input neurons. It also allows networks to be compared in terms of their structure and complexity. The method is demonstrated on some small toy functions and on the larger problem of the MNIST handwritten digit classification data set.
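    A minimal sketch of the kind of decomposition described here, assuming the trained MLP is available as an ordinary Python callable over {-1,1}^n; the function names and the toy target below are illustrative, not the paper's code.

```python
from itertools import product, combinations
import numpy as np

def walsh_coefficients(f, n):
    """Exhaustive Walsh decomposition of f: {-1,1}^n -> R.

    Returns a dict mapping each subset S of input indices to the
    coefficient w_S = 2^-n * sum_x f(x) * prod_{i in S} x_i.
    Only practical for small n (the sum runs over all 2^n inputs).
    """
    inputs = list(product([-1, 1], repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            w = sum(f(x) * np.prod([x[i] for i in S]) for x in inputs) / 2 ** n
            coeffs[S] = w
    return coeffs

# Toy example in place of a trained MLP: one first-order term and one
# second-order interaction. The non-zero coefficients recover that structure.
f = lambda x: 0.5 * x[0] + x[1] * x[2]
for S, w in walsh_coefficients(f, 3).items():
    if abs(w) > 1e-9:
        print(S, round(w, 3))   # -> (0,) 0.5 and (1, 2) 1.0
```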

    The Perils of Ignoring Data Suitability: The Suitability of Data Used to Train Neural Networks Deserves More Attention

    The quality and quantity of the data used for a machine learning task (together referred to here as its suitability) are as important as the capability of the machine learning algorithm itself. Yet these two aspects of machine learning are not given equal weight by the data mining, machine learning and neural computing communities. Data suitability is largely ignored compared to the effort expended on learning algorithm development. This position paper argues that some of the new algorithms, and many of the tweaks to existing algorithms, would be unnecessary if the data going into them were properly pre-processed, and calls for a shift in effort towards data suitability assessment and correction.

    Mixed Order Hyper-Networks for Function Approximation and Optimisation

    Many systems take inputs, which can be measured and sometimes controlled, and produce outputs, which can also be measured and which depend on the inputs. Taking numerous measurements from such systems produces data, which may be used either to model the system with the goal of predicting the output associated with a given input (function approximation, or regression) or to find the input settings required to produce a desired output (optimisation, or search). Approximating or optimising a function is central to the field of computational intelligence. There are many existing methods for performing regression and optimisation based on samples of data, but they all have limitations. Multilayer perceptrons (MLPs) are universal approximators, but they suffer from the black box problem, which means that their structure and the function they implement are opaque to the user. They also have a propensity to become trapped in local minima or large plateaux in the error function during learning. A regression method with a structure that allows models to be compared, human knowledge to be extracted, optimisation searches to be guided and model complexity to be controlled is desirable. This thesis presents such a method: a single framework for both regression and optimisation, the mixed order hyper network (MOHN). A MOHN implements a function f: {-1,1}^n -> R to arbitrary precision. The structure of a MOHN makes explicit the ways in which input variables interact to determine the function output, which allows human insight and complexity control that are very difficult to achieve in neural networks with hidden units. The explicit structure representation also allows efficient algorithms for searching for an input pattern that leads to a desired output. A number of learning rules for estimating the weights from a sample of data are presented, along with a heuristic method for choosing which connections to include in a model. Several methods for searching a MOHN for inputs that lead to a desired output are compared. Experiments compare a MOHN to an MLP on regression tasks. The MOHN is found to achieve a comparable level of accuracy to an MLP but suffers less from local minima in the error function and shows less variance across multiple training trials. It is also easier to interpret and to combine into an ensemble. The trade-off between the fit of a model to its training data and its fit to an independent set of test data is shown to be easier to control in a MOHN than in an MLP. A MOHN is also compared to a number of existing optimisation methods, including estimation of distribution algorithms, genetic algorithms and simulated annealing. The MOHN is able to find optimal solutions in far fewer function evaluations than these methods on tasks selected from the literature.
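    A rough sketch of the MOHN idea as described above: a weighted sum of products over selected subsets of +/-1 inputs, one weight per connection. The weights here are fitted with ordinary least squares as one plausible stand-in for the thesis's learning rules; the class name and example target are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

class MOHN:
    """Sketch of a mixed order hyper-network: a weighted sum of products
    of +/-1 inputs, with one weight per chosen connection (input subset)."""

    def __init__(self, connections):
        self.connections = [tuple(c) for c in connections]
        self.weights = np.zeros(len(self.connections))

    def _design(self, X):
        # Column j is the product of the inputs named by connection j;
        # the empty connection acts as a bias term.
        return np.column_stack(
            [np.prod(X[:, list(c)], axis=1) if c else np.ones(len(X))
             for c in self.connections])

    def fit(self, X, y):
        # One possible learning rule: ordinary least squares over the
        # monomial features defined by the connections (an assumption,
        # not necessarily the rule used in the thesis).
        self.weights, *_ = np.linalg.lstsq(self._design(X), y, rcond=None)
        return self

    def predict(self, X):
        return self._design(X) @ self.weights

# Example: a second-order MOHN over 4 inputs recovers explicit structure.
conns = [()] + [(i,) for i in range(4)] + list(combinations(range(4), 2))
rng = np.random.default_rng(0)
X = rng.choice([-1, 1], size=(200, 4))
y = 2 * X[:, 0] * X[:, 1] - X[:, 2]
model = MOHN(conns).fit(X, y)
print(dict(zip(model.connections, np.round(model.weights, 2))))
```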

    Mixed order associative networks for function approximation, optimisation and sampling

    A mixed order associative neural network with n neurons and a modified Hebbian learning rule can learn any function f: {-1,1}^n -> R and reproduce its output as the network's energy function. The network weights are equal to Walsh coefficients, the fixed point attractors are local maxima in the function, and partial sums across the weights of the network calculate averages for hyperplanes through the function. If the network is trained on data sampled from a distribution, then marginal and conditional probability calculations may be made, and samples from the distribution can be generated from the network. These qualities make the network ideal for optimisation fitness function modelling, and make the relationships amongst variables explicit in a way that architectures such as the MLP do not.
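    A minimal sketch of two of the ideas above, assuming the modified Hebbian rule amounts to averaging the target against products of neuron states (an assumption, not the paper's exact rule); the hill-climb at the end illustrates how local maxima of the energy act as fixed point attractors.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n = 4
conns = [c for k in range(1, 3) for c in combinations(range(n), k)]  # orders 1-2

# Hebbian-style estimate: each weight is the sample average of the target
# times the product of its neurons' states (uniform +/-1 inputs assumed).
X = rng.choice([-1, 1], size=(500, n))
y = X[:, 0] * X[:, 1] + 0.5 * X[:, 3]
W = {c: float(np.mean(y * np.prod(X[:, list(c)], axis=1))) for c in conns}

def energy(x):
    """Network energy: the learned approximation of the target function."""
    return sum(w * np.prod([x[i] for i in c]) for c, w in W.items())

# Asynchronous hill-climbing on the energy settles at a local maximum,
# i.e. a fixed point attractor of the network.
x = rng.choice([-1, 1], size=n)
changed = True
while changed:
    changed = False
    for i in range(n):
        flipped = x.copy()
        flipped[i] *= -1
        if energy(flipped) > energy(x):
            x, changed = flipped, True
print(x, energy(x))
```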

    Learning and Searching Pseudo-Boolean Surrogate Functions from Small Samples

    When searching for input configurations that optimise the output of a system, it can be useful to build a statistical model of the system being optimised. This is done in approaches such as surrogate model-based optimisation, estimation of distribution algorithms and linkage learning algorithms. This paper presents a method for modelling pseudo-Boolean fitness functions using Walsh bases, and an algorithm designed to discover the non-zero coefficients while attempting to minimise the number of fitness function evaluations required. The resulting models reveal linkage structure that can be used to guide a search of the model efficiently. The paper presents experimental results in which benchmark problems are solved in fewer fitness function evaluations than those reported in the literature for other search methods such as EDAs and linkage learners.
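    The paper's own coefficient-discovery algorithm is not reproduced here; as a hedged illustration of the general idea, the sketch below fits a sparse Walsh model to a small sample with a Lasso penalty and reads the surviving coefficients as linkage structure.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import Lasso

# Sparse Walsh surrogate from a small sample. This swaps in an L1 (Lasso)
# fit over all monomials up to order 2 as a stand-in for the paper's
# discovery algorithm; the fitness function below is invented.
rng = np.random.default_rng(2)
n, m = 10, 60                          # 10 binary variables, 60 evaluations
X = rng.choice([-1, 1], size=(m, n))
fitness = X[:, 0] * X[:, 1] - 2 * X[:, 4] + X[:, 7] * X[:, 8]

conns = [(i,) for i in range(n)] + list(combinations(range(n), 2))
Phi = np.column_stack([np.prod(X[:, list(c)], axis=1) for c in conns])

model = Lasso(alpha=0.05).fit(Phi, fitness)
linkage = {c: round(w, 2) for c, w in zip(conns, model.coef_) if abs(w) > 0.1}
print(linkage)   # surviving terms reveal which variables interact
```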

    A Suite of Incremental Image Degradation Operators for Testing Image Classification Algorithms

    Convolutional Neural Networks (CNN) are extremely popular for modelling sound and images, but they suffer from a lack of robustness that could threaten their usefulness in applications where reliability is important. Recent studies have shown how it is possible to maliciously create adversarial images: images that appear to the human observer as perfect examples of one class but that fool a CNN into assigning them to a different, incorrect class. Creating these images takes some effort, as they must be designed specifically to fool a given network. In this paper we show that images can be degraded in a number of simple ways that need no careful design and would not affect a human observer's ability to classify them, but which cause severe deterioration in the performance of three different CNN models. We call the speed of this deterioration in performance under incremental degradations in image quality the degradation profile of a model, and argue that reporting the degradation profile is as important as reporting performance on clean images.
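    A minimal sketch of how a degradation profile might be computed, using additive Gaussian noise as one simple degradation operator; the model and data below are placeholders so the example runs end to end, and in practice `predict` would wrap a trained CNN.

```python
import numpy as np

def add_gaussian_noise(images, level, rng):
    """One simple degradation operator: additive Gaussian noise whose
    standard deviation grows with the degradation level (0..1)."""
    noisy = images + rng.normal(0.0, level, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)

def degradation_profile(predict, images, labels, levels, rng):
    """Accuracy at each degradation level; how fast it falls is the profile."""
    return [float(np.mean(predict(add_gaussian_noise(images, lv, rng)) == labels))
            for lv in levels]

# Placeholder classifier and data standing in for a CNN and a real test set.
rng = np.random.default_rng(3)
images = rng.random((200, 28, 28))
labels = (images.mean(axis=(1, 2)) > 0.5).astype(int)
predict = lambda imgs: (imgs.mean(axis=(1, 2)) > 0.5).astype(int)

print(degradation_profile(predict, images, labels, [0.0, 0.1, 0.2, 0.4, 0.8], rng))
```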

    A Haptic Interface for Guiding People with Visual Impairment using Three Dimensional Computer Vision

    Computer vision technology has the potential to provide life-changing assistance to blind or visually impaired (BVI) people. This paper presents a technique for locating objects in three dimensions and guiding a person's hand to the object. Computer vision algorithms are used to locate both objects of interest and the user's hand. Their relative locations are used to calculate the movement required to bring the hand closer to the object. The required direction is signalled to the user via a haptic wrist band, which consists of four haptic motors worn at the four compass points on the wrist. Guidance works in both two and three dimensions, making use of both colour and depth map inputs from a camera. User testing found that people were able to follow the haptic instructions and move their hand to locations on vertical or horizontal surfaces. This work is part of the Artificial Intelligence Sight Loss Assistant (AISLA) project.
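    A simplified sketch of the guidance step described above, assuming the hand and object positions have already been recovered by the vision pipeline; the motor names and distance threshold are illustrative choices, not the project's code.

```python
import numpy as np

def motor_command(hand_xy, object_xy, threshold=0.03):
    """Map the hand-to-object displacement onto one of four wrist motors.

    Positions are given directly here (in metres); the real system derives
    them from colour and depth images before this step.
    """
    dx, dy = np.asarray(object_xy) - np.asarray(hand_xy)
    if np.hypot(dx, dy) < threshold:
        return "stop"                         # hand is on target
    if abs(dx) >= abs(dy):
        return "east" if dx > 0 else "west"   # dominant horizontal error
    return "north" if dy > 0 else "south"     # dominant vertical error

print(motor_command(hand_xy=(0.10, 0.20), object_xy=(0.35, 0.25)))  # -> east
```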

    Learning Spatial Relations with a Standard Convolutional Neural Network

    This paper shows how a standard convolutional neural network (CNN) without recurrent connections is able to learn general spatial relationships between different objects in an image. A dataset was constructed by placing objects from the Fashion-MNIST dataset onto a larger canvas in various relational locations (for example, trousers left of a shirt, both above a bag). CNNs were trained to perform two different types of task: the first was to name the objects and their spatial relationships, and the second was to answer relational questions such as "Where is the shoe in relation to the bag?". The models performed at above 80% accuracy on test data. The models were also capable of generalising to spatial combinations that had been intentionally excluded from the training data.
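    A hedged sketch of the dataset construction described above, with random arrays standing in for Fashion-MNIST items, coarse fixed positions, and only two relations shown; the function name and canvas size are illustrative.

```python
import numpy as np

def make_relational_example(item_a, item_b, relation, canvas_size=96):
    """Place two 28x28 items on a larger canvas according to a named relation."""
    canvas = np.zeros((canvas_size, canvas_size), dtype=np.float32)
    if relation == "left_of":
        pos_a, pos_b = (34, 5), (34, 60)      # (row, col) of top-left corners
    elif relation == "above":
        pos_a, pos_b = (5, 34), (60, 34)
    else:
        raise ValueError(f"unknown relation: {relation}")
    for item, (r, c) in [(item_a, pos_a), (item_b, pos_b)]:
        canvas[r:r + 28, c:c + 28] = item
    return canvas, relation

# Random arrays stand in for two Fashion-MNIST images.
item_a = np.random.default_rng(4).random((28, 28))
item_b = np.random.default_rng(5).random((28, 28))
canvas, label = make_relational_example(item_a, item_b, "left_of")
print(canvas.shape, label)   # (96, 96) left_of
```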

    GMM-IL: Image Classification Using Incrementally Learnt, Independent Probabilistic Models for Small Sample Sizes

    When deep-learning classifiers try to learn new classes through supervised learning, they exhibit catastrophic forgetting. In this paper we propose the Gaussian Mixture Model - Incremental Learner (GMM-IL), a novel two-stage architecture that couples unsupervised visual feature learning with supervised probabilistic models to represent each class. The key novelty of GMM-IL is that each class is learnt independently of the other classes. New classes can be incrementally learnt using a small set of annotated images with no requirement to relearn data from existing classes. This enables the incremental addition of classes to a model that can be indexed by visual features and reasoned over based on perception. Using Gaussian Mixture Models to represent the independent classes, we outperform a benchmark of an equivalent network with a Softmax head, obtaining increased accuracy for sample sizes smaller than 12 and an increased weighted F1 score for three imbalanced class profiles in that sample range. This novel method enables new classes to be added to a system with access to only a few annotated images of the new class.
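    A minimal sketch of the per-class modelling idea, assuming visual features are produced elsewhere by an unsupervised encoder; scikit-learn's GaussianMixture, the class name, the component count and the toy features are illustrative stand-ins rather than the paper's implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

class PerClassGMMClassifier:
    """One independent Gaussian mixture per class over pre-computed visual
    features, so a new class can be added without touching existing models."""

    def __init__(self, n_components=2):
        self.n_components = n_components
        self.models = {}                      # class label -> fitted GMM

    def add_class(self, label, features):
        gmm = GaussianMixture(n_components=self.n_components, random_state=0)
        self.models[label] = gmm.fit(features)

    def predict(self, features):
        labels = list(self.models)
        scores = np.column_stack(
            [self.models[l].score_samples(features) for l in labels])
        return [labels[i] for i in scores.argmax(axis=1)]

# Toy features standing in for the output of an unsupervised encoder.
rng = np.random.default_rng(6)
clf = PerClassGMMClassifier()
clf.add_class("shirt", rng.normal(0, 1, (30, 8)))
clf.add_class("bag", rng.normal(3, 1, (30, 8)))   # added later, independently
print(clf.predict(rng.normal(3, 1, (5, 8))))      # -> mostly "bag"
```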

    Spatio-temporal evaluation of social media as a tool for livestock disease surveillance

    Recent outbreaks of Avian Influenza across Europe have highlighted the potential for syndromic surveillance systems that consider other modes of data, namely social media. This study investigates the feasibility of using social media, primarily Twitter, to monitor illness outbreaks such as avian flu. Using temporal, geographical and correlation analyses, we investigated the association between avian influenza tweets and officially verified cases in the United Kingdom in 2021 and 2022. The Pearson correlation coefficient, bivariate Moran's I analysis and time series analysis were among the methodologies used. The findings show a weak, statistically insignificant relationship between the number of tweets and confirmed cases in a temporal context, implying that relying solely on social media data for surveillance may be insufficient. The spatial analysis provided insights into the overlaps between confirmed cases and tweet locations, shedding light on regionally targeted interventions during outbreaks. Although social media can be useful for understanding public sentiment and concerns during outbreaks, it must be combined with traditional surveillance methods and official data sources for a more accurate and comprehensive approach. Improved data mining techniques and real-time analysis could improve outbreak detection and response even further. This study underscores the need for a strong surveillance system to properly monitor and manage disease outbreaks and protect public health.
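    A minimal sketch of the temporal part of such an analysis, correlating weekly tweet counts with weekly confirmed-case counts; the numbers below are invented for illustration, whereas the study used UK avian influenza data from 2021 and 2022 and reports a weak, statistically insignificant correlation.

```python
import numpy as np
from scipy.stats import pearsonr

# Invented weekly counts standing in for the real tweet and case series.
weekly_tweets = np.array([12, 30, 25, 8, 40, 55, 20, 18, 60, 22])
weekly_cases  = np.array([1, 0, 2, 3, 0, 1, 4, 0, 1, 2])

r, p_value = pearsonr(weekly_tweets, weekly_cases)
print(f"Pearson r = {r:.2f}, p = {p_value:.3f}")
```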